Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable and fix additional build warnings #655

Merged
merged 5 commits into from
Jan 10, 2025
Merged

Conversation

NWilson
Copy link
Member

@NWilson NWilson commented Jan 7, 2025

Part of #651

  • Use much stricter windows warnings (/W3 rather than /W1). This requires quite a few fixes for all the sloppy places where we do implicit assignment of 64-bit values to 32-bit storage.
  • Use and test CMake build & install on FreeBSD and Solaris
  • Add 64 bit Solaris build (cc -m64) and fix existing Solaris warnings
  • Make compile flags used in CI consistent across platforms. Previously Mac & Linux were building with different warning flags.
  • Add --enable-Werror to configure.ac. This means that you can build with -Werror in a clean way. Previously, you had to hackily override the CPPFLAGS when calling make since you can't pass -Werror as a CFLAG into ./configure (it messes with compiler feature detection).

@NWilson NWilson force-pushed the user/niwilson/more-warnings branch from c677841 to 989f5ae Compare January 7, 2025 12:44
Comment on lines +1266 to +1267
if (op->one_char > 0) snprintf(s, sizeof(s), "-%c,", op->one_char);
else snprintf(s, sizeof(s), " ");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a fear of sprintf, strcpy and strcat. Perhaps we should ban them. All the current uses within the code seem harmless though - but they can go so badly wrong.

@NWilson NWilson force-pushed the user/niwilson/more-warnings branch from 989f5ae to 327bd0c Compare January 7, 2025 17:41
maint/manifest-cmakeinstall-freebsd Show resolved Hide resolved
src/pcre2_jit_compile.c Outdated Show resolved Hide resolved
@@ -3380,7 +3381,7 @@ OP2(SLJIT_SUB | SLJIT_SET_Z, COUNT_MATCH, 0, COUNT_MATCH, 0, SLJIT_IMM, 1);
add_jump(compiler, &common->calllimit, JUMP(SLJIT_ZERO));
}

static SLJIT_INLINE void allocate_stack(compiler_common *common, int size)
static SLJIT_INLINE void allocate_stack(compiler_common *common, sljit_sw size)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size is a small number. Does it need to eliminate warnings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things like strlen, or subtraction of pointers, return a size_t, which is then passed to allocate_stack(). So I assumed that it could indeed be passed a size_t input, and that the various operations below would be able to handle it. But if it's a stack allocation, I guess that's not correct anyway, and the caller does need to sure it's only passing in "small" values.

I'll change it back, and update the callers not to cast away any large values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zherczeg What's the maximum amount you can pass to allocate_stack? There's one caller which passes in the callout_arg_size, which is 2 * sizeof(void*) * top_bracket. Because you can have 65536 capture groups, that's an entire 1MiB of data plus some extra few bytes.

Is the stack able to support enough for that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I mixed this function up with sljit_alloc_memory. That is designed for small allocations.

This function should not be limited, although with very large numbers we could do a carry flag check.

src/pcre2_jit_neon_inc.h Show resolved Hide resolved
@@ -1838,7 +1838,8 @@ if (*next == OP_BRAZERO || *next == OP_BRAMINZERO)

if (i == max)
{
common->private_data_ptrs[max_end - common->start - LINK_SIZE] = next_end - max_end;
/* Patterns must fit into an int32 even for link-size=4. */
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment I have just added is correct.

I was worried about whether there's any case where a pattern can by 3GiB big (ie, bigger than an S32 can hold, but smaller than the U32 maximum for a pattern with link-size=4).

I can see there's some nice code elsewhere:

#define MAX_PATTERN_SIZE (1 << 30) /* Keep it positive */

("Keep it positive" clearly means, "don't use 1<<32" as the max size, but make sure it's comfortably smaller than that.)

This apparently ensures that you can't get in this situation, where you have a pattern whose compiled size is too large to fit in an int.

@@ -2460,7 +2461,7 @@ status->compiler = common->compiler;
}

static void delayed_mem_copy_move(delayed_mem_copy_status *status, int load_base, sljit_sw load_offset,
int store_base, sljit_sw store_offset)
int store_base, int store_offset)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The store_offset arg is placed in the status->store_offsets array, which is already an int in the current code. So I think it's safe to make the argument an int (and hence avoid truncation when assigning to status->store_offsets).

That just moves the problem elsewhere though: now we need to worry about truncating to a 32-bit int in all the code which calls delayed_mem_copy_move.

@@ -3184,7 +3185,7 @@ while (cc < ccend)
SLJIT_ASSERT(private_srcw[i] != 0);

if (!from_sp)
delayed_mem_copy_move(&status, base_reg, stackptr, SLJIT_SP, private_srcw[i]);
delayed_mem_copy_move(&status, base_reg, stackptr, SLJIT_SP, (int)private_srcw[i]); // XXX @zherczeg: cast OK?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zherczeg I have added three casts here. I cannot verify whether these are OK. They are all truncation of an sljit_sw to an int.

It appears that all of the data placed inside kept_shared_srcw is within the range of an int; but private_srcw and shared_srcw are harder to check.

I'm hoping you "just know" already that these assignments are safe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sljit_s32 could be better than int. The casts are ok, this should copy between (machine) stack and (backtracking) stack :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for helping me out!

I have changed some uses of int to sljit_s32.

I think the changes to pcre2_jit_compile in this PR are now simply formatting - all the variables have the same type as before (since int and sljit_s32 are actually the same), and we've just added explicit casts for things that were previously implicit casts. So the runtime code should be identical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed some uses of int to sljit_s32.

historically (in the long obsolete PCRE1 codebase) all the interfaces used int and some might had not been updated since, so is usually safe to make this "conversion", but should also probably look for variables that were really meant to become PCRE2_SIZE, or in the case of JIT one of sljit_sw/sljit_uw.

Copy link
Member Author

@NWilson NWilson Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes Carlo, I did look for cases where it should be PCRE2_SIZE or sljit_sw. That's what the discussion was about. I only changed these specific values to sljit_s32, which Zoltan commented on.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PCRE1 used int because, being started in 1997, it was written to the C90 standard, in which int32_t etc did not exist. (Just for the record.)

@NWilson NWilson force-pushed the user/niwilson/more-warnings branch from 9a2231f to bf24f80 Compare January 9, 2025 20:57
@NWilson NWilson requested a review from zherczeg January 10, 2025 10:00
@NWilson
Copy link
Member Author

NWilson commented Jan 10, 2025

I've tagged Zoltan to re-confirm the changes I made in pcre2_jit_compile.c, before I merge.

Copy link
Collaborator

@zherczeg zherczeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@NWilson NWilson merged commit acb4b56 into master Jan 10, 2025
48 of 49 checks passed
@NWilson NWilson deleted the user/niwilson/more-warnings branch January 10, 2025 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants